home *** CD-ROM | disk | FTP | other *** search
Text File | 1992-10-27 | 115.3 KB | 3,265 lines |
- GA Toolkit
-
- Version 1.0
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- Sara A. Lienau
-
- University of Missouri-Kansas City
-
- May 20, 1992
-
-
-
- TABLE OF CONTENTS
-
- {toc \f|1. INTRODUCTION 1
-
- Overview of Genetic Algorithms 1
-
- The GA Toolkit 3
-
- Overview of Documentation 4
-
- 2. INSTALLATION 5
-
- 3. USER INTERFACE PROGRAM 8
-
- Selecting GA Components 8
-
- Description of Available Components 9
-
- Selecting the Problem to Solve 12
-
- Description of Available Problems 12
-
- Creating an Executable GA Program 14
-
- Creating Input Files 15
-
- Creating the "in" File 16
-
- Description of Input Parameters 17
-
- Creating the "genes" File 20
-
- Defining a New Problem 24
-
- 4. EXECUTING YOUR GA PROGRAM 29
-
- Description of Output Files 30
-
- 5. PROGRAMMER'S REFERENCE 33
-
- Description of Classes 33
-
- Description of #define Directives 36
-
- Adding a New GA Component 38
-
- Adding a New Parameter 50
-
- REFERENCES 56
-
- }
-
- CHAPTER 1
-
- {tc "1. INTRODUCTION"}Introduction
-
- This document describes a software tool for researching and applying Genetic
- Algorithms (GAs), called the GA Toolkit. GAs are powerful search and
- optimization techniques based on the principles of natural selection and
- population genetics. This chapter provides a brief overview of Genetic
- Algorithms, describes the GA Toolkit and present an overview of the remainder
- of this document.
-
- {tc "Overview of Genetic Algorithms" \l 2}Overview of Genetic Algorithms
-
- GAs manipulate a population of sample points from the search space. Each
- sample point or individual in the population has several
- components--chromosome, phenotype and fitness value. The chromosome is a coded
- representation for a particular point in the search space. In biological
- systems, the chromosomes contains the coded instructions for creating all the
- features of an organism. Similarly, in GAs, the chromosome contains a coded
- representation for the various features or parameters for the problem the GA is
- attempting to optimize. A phenotype is the expression of the genetic material.
- For example, a gene on a human chromosome may contain the code for blue eyes
- and the human with blue eyes is the expression of this genetic material or the
- phenotype. In GAs, the phenotype corresponds to decoding the chromosome
- structure into components or parameters that are meaningful in terms of the
- problem the GA is attempting to solve. The GA treats the optimization problem
- as a black box--it applies a set up inputs to the problem and measures the
- overall performance (called the fitness of the solution) of the set of inputs.
- The GA does not require knowledge of nor does it understand the relationships
- that may exist between various parameters of the problem. The GA uses the
- fitness value of the sample points and the similarities in their coded
- representation (the chromosome) to search for better points in the search
- space.
-
- This concept comes from viewing the evolution of species as an optimization
- process. Darwin's "survival of the fittest" principle (also called natural
- selection) suggested that species become better adapted to their environment
- because only the "fittest" individuals are able to survive and reproduce
- passing their genetic code on to subsequent generations. The "fittest"
- individuals are those with a some type of beneficial features or
- characteristics that enables the creature to survive and reproduce in its
- environment. As individuals with various beneficial characteristics mate
- subsequent generations are more likely to contain individuals with combinations
- of advantageous features that were possessed by their parents. And over time
- the population as a whole becomes better suited for its environment by
- preserving and propagating the beneficial characteristics.
-
- GAs use the principle of natural selection to mate the better individuals in
- the population with a higher frequency. The mating process, called crossover,
- exchanges segments of the parents' coded chromosome to create new offspring.
- This process allows the GA the combine features of good solutions in the hope
- of developing better solutions to the problem.
-
- The following describes the basic execution cycle of a Genetic Algorithm.
-
-
-
- Initialization
-
- Evaluation
-
- repeat
-
- Selection
-
- Recombination
-
- Evaluation
-
- Replacement
-
- until ( termination condition satisfied )
-
-
-
- The Initialization phase randomly creates an initial population. At first the
- GA has no information about the search space, so randomly sampling a number of
- points from the search space is a good place to start the search process. In
- the Evaluation phase, each chromosome is decoded and a fitness value is
- assigned using the fitness function defined for the problem. In the Selection
- phase, individuals in the population are selected for mating based on their
- fitness with more fit individuals having a greater probability of being
- selected. During the Recombination phase, the individuals selected in the
- previous phase mate via a crossover procedure and exchange segments of their
- coded chromosome to create new offspring. In addition, during this phase, the
- new offspring or the parents may undergo mutation that slightly modifies their
- genetic code. Then the new offspring and any parents that were modified by
- mutation are re-evaluated as described above. And finally, the Replacement
- phase determines which offspring and members of the current population will
- survive into the next generation. This cycle repeats until a termination
- condition has been satisfied, either a sufficiently fit population member or a
- maximum number of reproductive trials. For a more detailed description of
- Genetic Algorithms see Goldberg (1989) or Holland (1975).
-
- {tc "The GA Toolkit" \L 2}The GA Toolkit
-
- The description of GAs above presents a general framework for these algorithms.
- There are actually various implementation for each phase, that have been
- proposed by the leading researchers in the field, each with varying degrees of
- effectiveness on different types of problems. With the variety of techniques
- available, it is difficult to package together a single GA configuration that
- is best suited for all problem domains. The GA Toolkit is a software tool
- designed to provide a better alternative for researching and applying Genetic
- Algorithms.
-
- The GA Toolkit provides several techniques for each phase and allows the user
- to select which component to use. The GA Toolkit will actually create an
- executable GA program with the components that the user selects. This system
- allows the user to create a variety of GA programs for solving different types
- of problems. In addition, the Toolkit was designed so that new GA components
- could easily be added to the system as better techniques are developed.
-
- This tool offers many benefits to a variety of users. For those users that
- want to apply GAs, the Toolkit provides a collection of the most successful
- techniques available. These users can gain the benefits of using GAs without
- detailed knowledge of the underlying theory and without developing their own
- program code. For GA researchers, this tool greatly increases the ability to
- explore the effectiveness of different techniques in a variety of problem
- domains. In addition, the GA Toolkit allows the researcher to experiment with
- a particular technique in a number of different contexts. For example, the
- researcher could observer the performance of a selection technique using
- different crossover and replacement methods and develop a better evaluation of
- the selection procedure. This ability to isolate the effectiveness of a
- particular method is essential for developing useful guidelines on which
- components should be used in different problem domains.
-
- {tc "Overview of Documentation" \L 2}Overview of Documentation
-
- The remainder of this document describes how use the GA Toolkit. Chapter 2
- describes the installation process. Chapter 3 discusses the user interface
- program that allows the user to select the desired components and create an
- executable GA program. This chapter also describes the GA components and the
- set of problems that are currently implemented in the GA Toolkit. Chapter 4
- discusses how to execute the GA programs you create with the Toolkit and
- describes the output files produced by a GA program. And finally, Chapter 5
- provides a reference for programmers who which to modify the system. This
- chapter describes the classes used to implement the GA techniques and discusses
- how to add new GA components to the Toolkit.
-
-
-
- Chapter 2
-
- {tc "2. INSTALLATION"}Installation
-
- To install the GA Toolkit you need two files: install and gat.tar. Currently
- these files are located in the /home/slienau/Toolkit directory on the Sun SPARC
- workstations. If this directory does not exist when want to install the GA
- Toolkit, check with Dr. Graham about the new location of these files. Once you
- have found the necessary files you need to copy them to your home directory.
- The install file is a Bourne shell script that will create a main directory for
- the GA Toolkit, un-archive the files in the gat.tar file and create the
- executable user interface program (gatuif). Use the following command to
- execute the installation program.
-
- % install [directory name]
-
- The optional parameter allows you to specify the name of the directory where
- you want the GA Toolkit installed. If this parameter is omitted the Toolkit
- will be installed a directory called GATool. During the installation process
- several messages will be displayed about the files that are being un-archived.
- When the installation is complete a message will be displayed on your terminal.
- This message will inform you that you need to add the information in the
- cshrc.gat file (located in the Toolkit's main directory) to your .cshrc file in
- your home directory. This file contains several environment variables that are
- necessary for the Toolkit to operate correctly. The following shows the
- contents of this file. Note: In your file, the GATOOL environment variable
- may have a different directory name.
-
-
-
- #
-
- # GA Toolkit Env Variables
-
- #
-
- setenv GATOOL $home/GATool
-
- setenv GATSRC $GATOOL/Source
-
- setenv GATFNS $GATSRC/Funcs
-
- setenv GATIN $GATOOL/Input
-
- setenv GATHLP $GATOOL/Help
-
- setenv GATUIF $GATOOL/Uif
-
- setenv GATBIN $GATOOL/SH_BIN
-
- set path = ($GATBIN $GATUIF $path)
-
-
-
-
-
- The installation program creates several sub-directories under the directory
- where the GA Toolkit was installed. The following briefly describes the
- contents of each directory.
-
-
-
- Source contains the source code for all the GA components.
-
- Source/Funcs contains the source code for the test problems provided with the
- Toolkit.
-
- Input contains input files for the test problems in the Source/Funcs directory.
-
- Help contains the help files for the user interface program.
-
- Uif contains the source code for the user interface program.
-
- SH_BIN contains several Bourne shell programs used by the user interface
- program.
-
-
-
- The library of components in the GA Toolkit were written in the C++ language.
- In order to create executable GA programs you must be able to execute the C++
- compiler. If you do not have the proper environment variable set up in your
- .cshrc file for the C++ compiler, then you will also need to add this
- information. The information you, currently, need to include in your .cshrc
- file for the C++ compiler to work properly is shown below.
-
-
-
- #
-
- # C++ Env Variables (sun C++)
-
- #
-
- setenv MANPATH /usr/share/man:/usr/local/man:/usr/man
-
- setenv CCROOTDIR /usr/local/lang/SC1.0
-
- setenv CCLIBDIR $CCROOTDIR
-
- setenv I $CCROOTDIR/include/CC
-
- setenv LIBRARY $CCROOTDIR/libC.a
-
- setenv cfrontC $CCROOTDIR/cfront
-
-
-
- set path = (/usr/local/lang/SC1.0 $path)
-
-
-
- After you have made the necessary modifications to your .cshrc file you should
- logout and login again so that the changes will take effect. Now you have
- completely installed the Toolkit and are ready to begin working with the user
- interface program described in the next chapter.
-
-
-
- Chapter 3
-
- {tc "3. USER INTERFACE PROGRAM"}User Interface Program
-
- The user interface program for the GA Toolkit allows the user to select a
- component for each phase of the GA, select the problem to solve, create the
- necessary input files, create an executable GA program and create a template
- for defining a new problem. This section describes how to uses the various
- features of the user interface program.
-
- To start the user interface program use the following command:
-
- % gatuif
-
- The name "gatuif" is an acronym for GA Toolkit User InterFace. After a few
- seconds the main menu will appear on your terminal. From the main menu, you
- can access all the major features of the user interface program, such as
- selecting GA components and creating an executable GA program. Pressing the
- return key will activate the highlighted choice. The following describes each
- of the options available on the main menu.
-
- {tc "Selecting GA Components" \L 2}Selecting GA Components
-
- The first step in creating an executable GA program is selecting which
- components to use for each phase of the GA process. Activating the Select GA
- Components option will bring up a screen divided into two parts. On the left
- side of the screen there is a menu of components and the right portion of the
- screen shows the current configuration--the component that is currently
- selected for each phase. To change the component for a particular phase, use
- the up and down arrows to select the category on the menu (the left side of the
- screen). Pressing the return key will bring up a list of available components
- or a sub-menu for the highlighted category. Once the desired technique is
- highlighted, pressing the return key will select that technique. To see a
- short description of a particular GA component highlight it by moving the up
- and down arrow keys and press the F1 key (or the help key for your terminal).
- When you are satisfied with the configuration of components, press CTRL-U to
- return to the main menu.
-
- This portion of the user interface program also manages any constraints on
- which components may used together. When you select a category from the menu
- (e.g., Selection or Replacement) only those techniques that are compatible with
- the current configuration will be displayed. This will prevent you from
- accidentally creating an invalid GA configuration.
-
- {tc "Description of Available Components" \L 3}Description of Available
- Components
-
- Chromosome--Representation - The chromosome representation is how the
- chromosome structure is implemented. Currently the Toolkit only provides one
- implementation of the chromosome structure, but others, such as diploid or
- two-dimensional chromosomes, could be added.
-
-
-
- 1-D Bit String - With this component, the chromosome is implemented as a one
- dimensional string of ones and zeros.
-
-
-
- Chromosome--Decoding - The chromosome decoding specifies how the chromosome
- string will be interpreted or decoded for the evaluation function.
-
-
-
- Binary - This component provides binary decoding of the chromosome string
- (i.e., converting the string of 1s and 0s to a decimal integer).
-
-
-
- Gray - This component allows the chromosome string to be interpreted as Gray
- code. With Gray code, adjacent integers (e.g., 7 and 8 or 8 and 9) have a
- Hamming distance of one.
-
-
-
- Reproduction - The reproduction style defines how many offspring are produced
- before the replacement phase is initiated.
-
-
-
- Generational - In generational reproduction, an entire generation of offspring
- are produced and then the replacement phase determines which offspring and
- parents will survive in the next generation.
-
-
-
- Individual - With individual reproduction, as soon an offspring is created the
- replacement phase selects a member of the population to be removed in order to
- make room of the new offspring. The concept behind individual reproduction is
- that the GA could adapt faster if offspring are made available as soon as they
- are produced rather than waiting for a complete generation.
-
-
-
- Selection - The selection technique determines which members will be given an
- opportunity to reproduce and pass their genetic code onto subsequent
- generations. In general, the selection technique will provide more
- reproductive trials to the more fit individuals of the population.
-
-
-
- SUS - In Stochastic Universal Sampling (SUS) (Baker 1987), each individual is
- allocated a slice on a spinning wheel according to its expected value. An
- individual's expected value is the ratio of the its fitness to average fitness
- of the population. The wheel contains N evenly spaced pointers, where N
- is the size of the population. The wheel is spun one time and the individual
- at each pointer is allocated a reproductive trial. With this selection
- technique, each individual is guaranteed to receive either {EMBED Equation |}
- or {EMBED Equation |} trials.
-
-
-
- GENESIS Rank - GENESIS Rank selection (Grefenstette 1990) is similar to SUS
- using a spinning wheel with N evenly spaced pointer. But the size of the
- slice on the wheel is determined by the individual's rank within the population
- rather than the actual fitness value.
-
-
-
- Linear Rank - Linear Rank selection (Whitley 1989) uses a linear function to
- bias the allocation of reproductive trials toward the better ranked
- individuals. Linear Rank selection provides a parameter, called Selection
- Bias, so that the degree of preference for the best members can be adjusted.
- For example, a Selection Bias of 1.5 means that the best individual is 1.5
- times more likely to receive a reproductive trial than the median ranked
- individual in the population.
-
-
-
- Replacement - The replacement phase determines which individuals will be
- replaced when new offspring are produced. Since the replacement phase is
- initiated at different times with the different styles of reproduction, there
- are several techniques that are appropriate for each type of reproduction.
-
-
-
- Replacement Techniques for Generational Reproduction
-
-
-
- DeJong's Gap w/Single Elitism - This technique implements DeJong's Gap model
- (1975) that allows overlapping population. The DeJong Gap specifies what
- percentage of the next generation should be created by the genetic operators
- (selection, crossover and mutation) and the remainder of the population is
- filled by randomly selecting, without replacement, individuals from the parent
- population. For example, a DeJong Gap of 0.8 would mean that 80% of the next
- generation would be new offspring and the remaining 20% would be randomly
- selected members from the parent population. This technique also provides a
- single elitism feature which guarantees that the best solution developed so far
- will always have a slot in the next generation.
-
-
-
- Variable Elitism - This replacement policy allows the user to specify the
- percentage of the population slots (by the Elite Gap parameter) to reserve for
- the best solutions developed so far and the remaining slots are filled by new
- offspring. With this technique, N offspring are created by the genetic
- operators (selection, crossover and mutation) and then the best
-
- N * Elite Gap individuals from among both the new offspring and parents
- receive a slot in the next generation. Any remaining population slots are
- filled by randomly selecting, without replacement, members from the set of new
- offspring. For example, an Elite Gap of 0.35 reserves 35% of the next
- generation for the best individuals and the remaining 65% is filled with new
- offspring.
-
-
-
- Replacement Techniques for Individual Reproduction
-
-
-
- Remove Worst - With this replacement policy, when an offspring is created the
- worst member of the population is replaced.
-
-
-
- Reverse Rank - This replacement policy uses a linear function to bias the
- selection for replacement towards the worst ranked members of the population.
- The Replacement Bias parameter can be adjusted to altered the pressure on the
- worst individuals. For example, a Replacement Bias of 1.2 means that the worst
- individual in 1.2 times more likely to be replaced than the median ranked
- individual in the population. In addition, the Elite Gap parameter can also be
- used to explicitly preserve a percentage of the best ranked individuals. For
- example, an Elite Gap of 0.1 with a population size of 50 means that the best
- 10% of the population (the 5 highest ranked members) cannot be replaced.
-
-
-
- Reverse Tournament - With this replacement technique, two members from the
- population are randomly selected and the least fit member is replaced by the
- new offspring.
-
-
-
- Generational Emulation - This replacement policy is attempting to emulate the
- replacement policy used in standard generational GAs while using individual
- reproduction. This policy monitors the trials allocated and used by each
- individual. When a member has exhausted its allocated trials or did not
- receive any trials because of low fitness it becomes a candidate for
- replacement. When an offspring is created, a randomly selected individual from
- the set of candidates is replaced. This replacement policy can only be used
- with selection techniques that explicitly calculate the number of trials each
- individual should receive (i.e., SUS and GENESIS Rank selection).
-
-
-
- Crossover--Style - The crossover style specifies how the desired number of cut
- points will be selected.
-
-
-
- Traditional - With traditional crossover the cut points are randomly selected.
-
-
-
- Reduced Surrogate - This crossover technique (Booker 1987) attempts to create
- offspring that differ from their parents whenever possible. This techniques
- notes the position on the chromosome string where the parents have different
- values (called the "reduced surrogate") and randomly selects cuts points
- between the first and last position in the reduced surrogate. For example,
-
-
-
- Parent1 11001100
-
- Parent2 10010110
-
-
-
- Reduced -1-01-0-
-
- Surrogate -0-10-1-
-
-
-
- A cut point any where after the second position and the before the seventh
- position would produce an offspring that differed from its parents.
-
-
-
- Uniform - With Uniform crossover (Syswerda 1989), the parents' value for each
- position on the chromosome string is swapped with a fixed probability to create
- new offspring. For example, if the swapping probability is 0.5, then for each
- position on the chromosome string we would flip a fair coin to decide whether
- the offspring should receive the bit value from the first or second parent.
-
-
-
- Crossover--# Cut Points - This category specifies the number of cut points that
- will be selected or the how many segments to exchange between parents when
- creating new offspring. You can select either one point or two point
- crossover.
-
-
-
- Mutation
-
- Canonical - Once a bit has be selected for mutation, this technique randomly
- chooses a new value for the bit from the set of possible values.
-
-
-
- Statistics - This category defines the statistics to collect and report about
- the GA experiment.
-
-
-
- Basic Stats - This technique provides information about population convergence,
- the best and average performance of the current population as well as on-line
- (average of all solutions tested) and off-line (average of the best solution
- for each generation) performance measures. See DeJong (1975) for a discussion
- of on-line and off-line performance measures.
-
- {tc "Selecting the Problem to Solve" \L 2}Selecting the Problem to Solve
-
- The Select Problem option on the main menu allows you to select the problem the
- GA will attempt to solve. Pressing the return key when Select Problem is
- highlighted on the main menu will bring up a screen divided into two parts.
- The top of the screen displays the name of the problem that is currently
- selected. The bottom portion of the screen lists all the problems that are
- currently defined in the system. To change the problem, use the up and down
- arrow keys to highlight the desired problem, in the bottom portion of the
- screen, and press the return key to select the problem. After you have
- selected the desired problem, press CTRL-U to return to the main menu.
-
- {tc "Description of Available Problems" \L 3}Description of Available Problems
-
- Counting 1s - In this problem, the GA is attempting to maximized the number of
- 1s in the chromosome string. The fitness value is the number of 1s in the
- chromosome string. (File func.ct1, minimization problem)
-
-
-
- Partially Deceptive - This is an artificially created partially deceptive
- problem. This problem consists of ten 4-bit sub-problems distributed at
- positions i , i + 10, i + 20, i + 30, for i = 1 to 10 on the chromosome
- string. For each 4-bit sub-problem, the order-1 and order-2 schemata are fully
- deceptive, but the order-3 schemata are not deceptive. (File func.d4,
- maximization problem) See Whitley (1991) for a discussion of deception in GAs.
-
-
-
- DeJong's F1 through F5 are the five problems from DeJong's test suite. (See
- DeJong 1975 for a description of these problems.)
-
-
-
- Ackley's Fig. 1-7 - This problem (Ackley 1987, 16-17) is basically a flat space
- with several very steep hills. The problem is defined by the following
- function: {EMBED Equation |}. (File func.fig7, maximization problem)
-
-
-
- Noisy Ackley's Fig. 1-7 - This problem is similar to Ackley's Fig. 1-7 except
- that a noise factor is used to alter the value of the function in about half
- the chromosomes evaluated. The noise factor is created by generating a random
- number in the range [0, 1) and using it as a percentage of how much the actual
- function value should be altered (i.e., noise factor = random number * function
- value). The GA randomly chooses to add or subtract the noise factor to the
- actual function value. (File func.nfig7, maximization problem)
-
-
-
- Noisy Shifted 3-D Parabola - This problem also incorporates a noise factor in a
- space defined by the following function: {EMBED Equation |}. This space is a
- 3-dimensional parabola shifted slightly off the origin. The noise factor is
- defined by the following: noise factor = (random number / 2) * function value.
- The GA randomly chooses to add or subtract the noise factor to the actual
- function value. On average, half of the function evaluation will be altered by
- the noise factor. (func.npar, minimization problem)
-
-
-
- Shifted 3-D Parabola - The space of this problem is a 3-dimensional parabola
- shifted slightly off the origin and defined by the following function: {EMBED
- Equation |}. (File func.par, minimization problem)
-
-
-
- Select/Replace Test - This problem was used to observe the characteristics of
- various combinations of selection and replacement techniques. The function
- used for this problem is the following: {EMBED Equation |}. (File func.repl,
- maximization problem)
-
-
-
- TSP-Swap - This is a 16 city Traveling Salesman Problem where we are attempting
- to find the shortest tour between 16 cities in the continental US traveling by
- automobile. In this problem, we use the information in the chromosome to tell
- us how to create a permutation of the list of cities to visit. Each pair of
- genes on the chromosome, tell us which two cities to swap to create a different
- tour and perhaps located the optimum tour. (File func.swap, minimization
- problem)
-
-
-
- TSP-New-Swap - This problem is similar to the TSP-Swap problem, but we
- interpret the information in the chromosome slightly different. Instead of
- using pairs of genes for swapping instructions, we use the gene position and
- the gene value. For example, if the first gene has a value of 5 then we swap
- the cities at the first and fifth position on our list, if the second gene has
- a value of 4 then we swap the cities at the second and fourth position on the
- list, and so on. (File func.swp, minimization problem)
-
- {tc "Creating an Executable GA Program" \L 2}Creating an Executable GA Program
-
- After you have selected the component to use for each GA phase and the problem
- the GA will be applied to, you can create the executable GA program. Use the
- up and down arrow keys to select the Make GA option from the main menu and
- press the return key. A dialog box will appear asking you to give a name for
- the executable GA program. After providing a name the system will begin
- re-compiling the necessary program code. This may take several minutes
- depending on which components were changed, so please be patient. When the
- executable program has been created a message will appear in the dialog box
- informing you that the program was successfully created. Pressing the return
- key after receiving the success message will return you to the main menu.
-
- Compilation error should not occur if you are using the components original
- provided with the GA Toolkit. However, if you have incorporated additional
- components or modified the components in the Toolkit compilation errors could
- occur if you have made an error in the program code. If errors were
- encountered during the re-compiling process a message will appear that will
- inform you of this event. This message will also tell you about a file that
- contains all the message produced during the re-compilation process. Pressing
- the return key while this message is displayed will return you to the main menu
- where you should exit from the user interface program. By examining the file
- containing the messages from the re-compilation process, you should be able to
- locate where the error occurred and be able to correct it.
-
- Before you can execute the program you have just created you need to create
- several input files. This process is described in the next section. In
- addition, you should refer to Chapter 4 for a discussion of how to execute your
- GA program and a description of the output files created by the GA program.
-
- {tc "Creating Input Files" \L 2}Creating Input Files
-
- Before you can execute the GA program you will need to create two input files.
- The in file contains various user defined parameters for controlling the GA
- experiment such as the maximum number of trials and the crossover and mutation
- probabilities. The genes file contains information on how to interpret the
- chromosome string for the problem the GA is attempting to solve. For example,
- the genes file specifies the number of genes in the chromosome and the format
- for displaying the gene values. This section describes how to create these
- input file through the user interface program. In addition, this section also
- briefly describes the contents of each file.
-
- You begin the input file creation process by selecting the Create Input Files
- option on the main menu and pressing the return key. This will bring up a
- dialog box asking you to provide a file suffix for the input files. When you
- execute your GA program, the program will, by default, look for the "in" and
- "genes" input files. However, providing the GA program with the file suffix
- will cause the program to read the information from the input files with the
- specified suffix. For example, if you use the suffix "p1" (for problem 1) then
- the GA program will take the input information from the "in.p1" and "genes.p1"
- input files. Using a file suffix is helpful if you want to experiment with
- several problems with different input parameters and different chromosome
- formats. If you want to use a file suffix, then type a suffix in the space
- provided and press the return key. Otherwise, just press the return key to
- accept the default value (no file suffix).
-
- Then the Setup Input Files screen is displayed. This screen allows you to
- choose which file (in or genes) to create. Use the up and down arrow keys to
- highlight the desired option and press the return key to select the option.
- The following describes how to create the in file and describes the various
- parameters contained in this file. The Creating the "genes" File section will
- describe the genes file and its contents.
-
- {tc "Creating the \"in\" File" \L 3}Creating the "in" File
-
- Selecting the in file option will bring up a screen divided into two parts.
- The left side of the screen contains a menu with Edit and Save options. The
- right side of the screen lists the various user-defined parameters and default
- values for each. The default values are not the product of any scientific
- study to develop optimum parameter values--they are only reasonable settings
- for each parameter. Depending on the type of problem you want to solve the
- default values may or may not be appropriate. Feel free to change the value of
- any or all parameters.
-
- To edit the values of the parameters, press the return key when the Edit option
- is highlighted. A parameter on the right side of the screen will be
- highlighted (usually the first parameter on the screen). Use the up and down
- arrow keys to highlight a parameter. The SHIFT-U and SHIFT-D key combinations
- can also be used to scroll up or down an entire screen. Pressing the return
- key will bring up a dialog box asking you to enter a value for the highlighted
- parameter. If you enter an invalid value for the parameter a dialog box will
- appear explaining the appropriate values. After you have read this
- information, pressing the return key will allow you to return to the previous
- dialog box and enter an appropriate value. You can change as many parameters
- as you want by highlighting the parameter and pressing the return key. If you
- want more information about the parameters, pressing the F1 key will bring up a
- help dialog box that contains short descriptions of all of the parameters.
- When you have finished adjusting the parameter values, use the CTRL-U key
- combination to return to the menu on the left side of the screen. Then you can
- select the Save option to save your parameter settings. To exit from the in
- file creating section press CTRL-U again and you will return to the Setup Input
- Files screen that contains the menu where you selected which file to create (in
- or genes).
-
- {tc "Description of Input Parameters" \L 3}Description of Input Parameters
-
- Total Experiments - This is the number of experiments you would like to conduct
- with this set of input parameters. That is, the number of complete GA runs
- where the GA converges to solution or exceeds the maximum trials. In each
- experiment, the GA starts with the same initial population and then the random
- number generator is seeded with a different value so that a different set of
- choices will be made. Total Experiments should be an integer number greater
- than zero.
-
-
-
- Total Trials - This is the maximum number of trials you want to occur before
- the GA is stopped. Total Trials should be an integer number greater than zero.
-
-
-
- Population Size - This is the number of chromosomes or sample points of the
- search space you want the system to work with. Population Size should an even
- integer number greater than zero. (The number must be even because two parents
- are selected and always produce two offspring.)
-
-
-
- DeJong Gap - This is used with the DeJong Gap with Single Elitism replacement
- policy. This parameter should be a real number in the following range [0.0,
- 1.0]. The Gap measure specifies what percentage of the population should be
- created via the genetic operators (selection, crossover and mutation) and the
- remaining portion of the population are randomly selected members of the
- previous generation. For example, a Gap of 0.8 would mean 80% of new
- generation is created by selection, crossover and mutation and the remaining
- 20% are copies of randomly selected members of the previous generation.
-
-
-
- Elite Gap - This is used with the Variable Elitism and the Reverse Rank
- replacement policies. This parameter should be a real number in the following
- range [0.0, 1.0]. Elite Gap specifies the percentage of the population
- reserved for the best members so far. For example, suppose we have a
- population of 50 chromosomes and an Elite Gap of 0.1. This means we want to
- preserve the best 10% or 5 members.
-
-
-
- Prob. Cross - This is the crossover probability or how often crossover should
- occur. This should be a real number in the following range [0.0, 1.0].
-
-
-
- Prob. Mutation - This is the mutation probability or how often mutation should
- occur. This should be a real number in the following range [0.0, 1.0].
-
-
-
- Selection Bias - This is used with the Linear Rank selection technique. This
- is how much preference should be given to the better members of the population.
- Selection Bias should be a real number in the following range (1.0, 2.0]. For
- example, a selection bias of 1.5 means the best member of the population is 1.5
- times more likely to be selected than the median ranked individual of the
- population.
-
-
-
- Replacement Bias - This is used with the Reverse Rank replacement policy. This
- is how much pressure should be placed on replacing the worst individuals in the
- population. Replacement Bias should be a real number in the following range
- (1.0, 2.0]. For example, a Replacement Bias of 1.5 means the worst individual
- is 1.5 times more likely to be replaced than the median ranked individual.
-
-
-
- Maximum Spin - This is used to stop the experiment if no progress is being
- made. If a new chromosome has not be created in (Maximum Spin) generations
- then the population has probably converged and no real progress is being made
- so we should stop this experiment. This parameter should be an integer number
- greater than zero.
-
-
-
- Statistics Interval - This is how many trials should occur between printing
- experiment statistics. This parameter should be an integer greater than zero.
- Statistics are only calculated after (Population Size) attempted trials or
- roughly each generation. The number of actual trials (when a chromosome needs
- to be evaluated) may be less than the attempted trials. So the Statistics
- Interval is a lower bound on the number of trials that must occur before
- another set of statistics is printed. That is, the actual number of trials
- before the reporting of statistics may be slightly more than the specified
- Statistics Interval.
-
-
-
- Hamming Metrics - This is a boolean flag for whether the clustering and drift
- Hamming metrics (Frederick, Sedlmeyer and White 1991) for population
- convergence should be calculated and reported. Clustering measure the average
- Hamming distance between all pairs of chromosomes in the population. And drift
- measures the average Hamming distance between the best individual and all other
- chromosomes in the population.
-
-
-
- Best Size - This is how many of the best chromosomes developed should be saved
- and printed to a file (the best file described in Chapter 4) at the end of the
- experiment. This parameter should be an integer greater or equal to zero. If
- Best Size is greater zero, then the (Best Size) members with the best fitness
- values developed in the entire experiment are preserved in a separate area for
- reporting purposes only, regardless if any elitism policy is used in the other
- phases of the GA.
-
-
-
- Worst Size - This is used as a scaling mechanism with SUS selection. This is
- similar to the scaling window used in the GENESIS GA program. This parameter
- should be an integer number greater or equal to zero. If we are minimizing a
- numerical function, then the fitness of a member (used for the selection
- process) is the worst fitness value minus the function evaluation for this
- particular member. The worst fitness value is the worst member seen in the
- last (Worst Size * Population Size) trials, if Worst Size is greater than 0.
- This allows more differentiation between members whose fitness values are very
- close together. The smaller the Worst Size (greater than zero) the greater the
- differentiation. For a Worst Size of zero, the worst value is the worst member
- seen in the entire experiment thus far. This represents an infinite scaling
- window or very little ability to differentiation between very similar values.
- This scaling effect works analogously for maximization problems as well.
-
-
-
- Maximize - This is a boolean flag to indicate whether the evaluation function
- should be maximized or minimized. Turn the flag on (or yes) for maximization
- and off (or no) for minimization. Note: If you are setting parameters for one
- the problems provided with the GA Toolkit make sure you set this flag
- correctly, otherwise you will get unexpected results. The Description of
- Available Problems section above states whether each problem should be
- maximized or minimized.
-
-
-
- Evaluate All - This is a boolean flag to tell the system whether to evaluate
- all chromosomes or only those that need evaluation. Generally, if a new
- chromosome (offspring) is equal to one of its parents we do not need to
- re-evaluate the function again, but simply copy the function parameters and
- fitness measure from the matching parent. However, in noisy or non-stationary
- functions the same chromosome may have a different function evaluation at
- varying times, so all chromosomes inserted into the next generation need to be
- re-evaluated. Setting this flag on (or yes) forces all chromosomes to be
- evaluated at the end of each generation. If this parameter is turned off (or
- no), new chromosomes are only evaluated if they differ from their parents.
-
-
-
- Seed Population - This is a boolean flag for whether the initial population
- should be seeded with specific chromosomes from an input file. Setting the
- flag on (or yes) will cause the program to read the init file (or the
- init.suffix file if you provide the GA program a file suffix). This file
- should contain the chromosome strings you want to place in the initial
- population with one chromosome per line. This file can contain any where from
- one to an entire population of chromosomes.
-
-
-
- Debugging Level - This is used for tracing the procedure calls activated during
- the GA run. The debugging feature can produce and a lot of information that is
- not very interesting, but may be helpful in debugging a new technique. I would
- recommend always setting this flag to 0 (no debugging trace). However, if you
- are interested in this information a value of 1 shows entry and exit of all
- major procedures. A value of 2 produces the same information as a value of 1
- but also includes entry into class constructors and destructors. And finally a
- value of 3 shows the information for a value of 2 plus some addition
- information about major events that are occurring.
-
-
-
- Display - This is a boolean flag to indicate whether to show, on the screen,
- information about the GA's execution. This display includes information about
- the current trial, experiment statistics, the best chromosome so far and which
- GA components are being utilized. Setting the flag to on (or yes) turns the
- display mode on.
-
-
-
- Print Generations - This is a boolean flag to determine whether the population
- should be printed to a file (the pops file described in Chapter 4) at the end
- of each generation. Turning this flag on (or yes) will cause this information
- to be printed.
-
-
-
- Random Seed - This is the seed for the random number generation procedures.
- This parameter should be a long integer greater or equal to zero.
-
-
-
- Swapping Prob. - This is the swapping probability for Uniform crossover. This
- determines how frequently the positions on the chromosome string will be
- swapped. This should be a real number in the following range [0.0, 1.0].
-
- {tc "Creating the \"genes\" File" \L 3}Creating the "genes" File
-
- The genes file describes the structure of the chromosome string. This file
- specifies how many genes the chromosome contains, the length of each gene and
- the format for displaying the value of each gene. If you are applying a GA to
- one of the pre-defined problems that was installed with the GA Toolkit, then
- you do not need to create the genes file. The genes files for these problems
- have already been created and are located in the Input sub-directory. The
- suffix of the genes file corresponds the suffix of the file where the problem
- is defined (e.g., genes.ct1 is the genes file for the Counting Ones problem
- defined in the func.ct1 file).
-
- If you need to define a new genes file the following describes the process for
- accomplishing this task. Selecting the genes file from the Setup Input Files
- screen and pressing the return key brings up a dialog box asking whether you
- will be using the floating point representation. If the problem the GA is
- being applied to is defined using the floating point decoding mechanism (i.e.,
- the eval_chrom class that defines the problem is derived from the float_chrom
- class), then you should answer yes to this question. The answer to the
- floating point representation question determines what type of information is
- required in the genes file.
-
- After providing an answer to the floating point representation question, a
- screen divided into two sections will appear. The left side is again a menu
- that allows you to add, edit or delete gene specifications and save the
- information to the genes file. The right portion of the screen is a table
- format for displaying the gene specifications that you will create. For
- floating point representation, the table contains fields for the gene number,
- the minimum and maximum value of the floating point range, the number of values
- in the range, the format for printing the gene value and the length of the gene
- on the chromosome string. When you are not using the floating point
- representation the table contains three fields: the gene number, the format
- for printing the gene value and the length of the gene.
-
- Adding Genes
-
- The procedure for specifying the information in the genes file is similar
- whether or not the floating point representation is being used. The only
- difference is the fields that need to be specified. To add information about a
- gene highlight the Add Gene option in the menu on the left side of the screen
- and press the return key. Then the system will ask you to provide a value for
- each to the appropriate fields. The following describes each field you will be
- required to specify.
-
-
-
-
-
- Fields for Floating Point Representation
-
-
-
- Minimum Value - this is the smallest value in the desired floating point range.
-
-
-
- Maximum Value - this is the largest value in the desired floating point range.
-
-
-
- Number of Values - this is the number of values in the floating point range and
- defines the precision of the values. The value for this field must be a power
- of 2 (e.g., 2, 4, 8, 16, 32, 64, ...). For example, suppose the minimum value
- is -5.11, maximum value is 5.12 and the number of values is 1024, then the gene
- can represent values in the following range [-5.11, -5.10,
-
- -5.09, ... , 5.10, 5.11, 5.12] and the precision is 0.01.
-
-
-
- Format - this is the printf conversion string for printing the value of the
- gene. The gene values are store in an array of type double. The following are
- valid conversion strings:
-
-
-
- %f - for floating point conversion (e.g., 5.043245 using %.3f is printed as
- 5.043).
-
-
-
- %e and %E - for scientific notation (e.g., 504.3245 using %.4e is printed as
- 5.0432e+02 and
-
- using %.4E is printed as 5.0432E+02).
-
-
-
- %g and %G - %e or %E is used if the exponent is less than -4 or greater than
- or equal to the precision; otherwise %f is used (e.g., 504.3245 using %.5g is
- printed as 504.32 and
-
- 50432.45 using %.3g is printed as 5.04e+04).
-
-
-
- For more explanation of printf conversion strings refer to a C or C++
- programming manual. Note that since the values of the genes are stored as type
- double, the only valid conversion characters are f, e, E, g and G.
-
-
-
- Length - this field will automatically be calculated for you by using Number of
- Values field. For example, it the number of values field is 64 then the length
- is 6 positions on the chromosome string (log(64) base 2 = 6).
-
-
-
- Repetitions - After you have provided values for all the required fields the
- system will ask you how many repetitions of the specification you wish to
- create. If you have multiple genes with the same specification you will not
- need to repeatedly enter the same information over and over again, just specify
- the number of copies to create.
-
-
-
-
-
- Fields for Non-Floating Point Representation
-
-
-
- Format - this is again the printf conversion string for printing the gene
- value, similar to the format field with floating point representation (see
- above).
-
-
-
- Length - this is the number of positions on the chromosome string that the gene
- will occupy.
-
-
-
- Repetitions - similar to the floating point representation you can instruct the
- system to create multiple copies the gene specification you have just created.
-
-
-
- After you have finished the gene specification, the values for each field will
- be displayed in the table on the right side of the screen. If you create more
- than a screen full of genes then you can use SHIFT-U and SHIFT-D to scroll up
- or down to view the genes you have created. At this point you can edit or
- delete a gene you have created or add additional genes.
-
- Editing a Gene
-
- To edit a gene specification, highlight the Edit Gene option and press the
- return key. A dialog box will appear asking you for the number of the gene you
- wish to modify. The gene number is the first column in the table on the right
- portion of the screen. After you have provided the gene number a pop-up menu
- appears that contains the fields which you may edit. Using the up and down
- arrow keys to highlight the field to modify and press the return key. Another
- dialog box will appear asking you to enter a new value for the field. From the
- pop-up menu of fields you can modify any or all fields of that particular gene.
- When you are finished modifying the gene press CTRL-U to return to menu on the
- left side of the screen.
-
- Deleting a Gene
-
- If you made a mistake and want to completely remove a gene, highlight the
- Delete Gene option and press the return key. A dialog box will appear asking
- you for the number of the gene you wish to delete. Once you remove a gene the
- only way to undo this action is re-specific the gene through the Add Gene
- option on the menu.
-
- Saving the genes File and Exiting
-
- And finally, when you are finished creating the information about the genes,
- use the Save option to write this information to the genes file. To exit the
- genes specification section press the CTRL-U key combination. This will return
- you to the Setup Input Files screen, where you selected which file to create
- either the in or genes file. If you have already created both input files you
- can use the CTRL-U key combination to return to the main menu of the user
- interface program.
-
- {tc "Defining a New Problem" \L 2}Defining a New Problem
-
- The Create New Problem option on the main menu allows you to create a template
- file for defining the evaluation function for a new problem. Pressing the
- return key when the Create New Problem option is highlighted initiates this
- option. A series of dialog boxes will appear asking you questions about the
- new problem.
-
- You will first be asked to enter a suffix for the file that will contain the
- new problem. All evaluation functions are keep in files with "func" as the
- root name of the file. The file suffix allows the system to identify the
- proper file when you select a particular problem to solve. (Note: You should
- use a unique suffix otherwise the file for a previously defined problem will be
- replaced with the new file you are creating. See the Description of Available
- Problems above for the suffixes that are currently being used.)
-
- The second dialog box that will appear asks you to provide a short descriptive
- name for the problem. The name you provide is the description that will appear
- in the list of problems when you select the problem to solve through the Select
- Problem option on the main menu.
-
- The final question that you will be required to answer is the type of values
- that will be required by the new evaluation function. The system needs to know
- the type of values that are needed so that it can set up the appropriate
- representation and decoding mechanism. In this dialog box you have a menu of
- four choices. You can use the up and down arrows to highlight the desired type
- of values and press the return key to select that option. The following
- describes the four types of values.
-
-
-
- Properties of Chromosome String - this option is for evaluation functions that
- do not require the genes on the chromosome to be decoded as integers or
- floating point values. The fitness of the chromosome will be determined by
- some type of property of the binary chromosome representation. For example,
- the Counting Ones Problem would use this type of value since the fitness is
- related to the number of 1s in the chromosome string.
-
-
-
- Integers - Binary Decoding - this option is for evaluation functions that
- require integer parameters where you want the genes to be interpreted as binary
- code. That is, the genes on the chromosome string with be converted from the
- binary representation to decimal integers.
-
-
-
- Integers - Gray Decoding - this is similar to the previous option but the genes
- on the chromosome string will be interpreted as Gray code rather than binary
- code.
-
-
-
- Floating Point Values - this option is for evaluation functions that require
- floating point values. In the case of floating point values you do not need to
- specify the decoding mechanism (i.e., binary or Gray) that will be used when
- you define the problem. The decoding mechanism can be selected through the
- Select GA Components option on the main menu when you create the GA program
- that will be applied to this problem.
-
-
-
- After you have chosen the type of values that are required for the new problem,
- a message will appear telling you that the file for this problem has been
- successfully created and that you need to edit the file to add the program code
- for the evaluation function. Pressing the return key while this message is
- displayed will return to the main menu. If you want the edit the new file you
- will need to exit the user interface program.
-
- The files for all the evaluation functions are keep in the Source/Funcs
- sub-directory of the main directory where the GA Toolkit was installed. To
- edit the new file you will need to go to this directory. Your new file will
- have the "func" root name and the suffix you provided (e.g., func.test would be
- the file created if you specified "test" as the suffix). You can use any text
- editor to add the program code for your new evaluation function. The file you
- created will look similar to the following. The contents of the actual file
- are printed in a Courier font and additional comments about the file are
- printed in a Times Roman font.
-
-
-
- /* GA Toolkit - Sara A. Lienau
-
- *
-
- * Center for Advanced Technology
-
- * Computer Science and Telecommunications Program
-
- * University of Missouri - Kansas City
-
- * ------------------------------------------------------------------------
-
- * file: eval_chr.temp
-
- *
-
- * purpose: eval_chrom class template
-
- *
-
- * modified: 21 aug 91 - created
-
- */
-
-
-
- #include "extern.h"
-
- #include "chrom.h"
-
- #include "eval_chr.h"
-
-
-
- // It may be helpful to document some information about the function
-
- // here. This is entirely optional.
-
-
-
- /* Function Description
-
- The Name/Source item contains the short descriptive name that you specified for
- this problem. This is the name that will be displayed on the list of available
- problems from the Select Problem section of the user interface program.
-
- * Name/Source: Test Problem 1
-
- * Description:
-
- * Properties:
-
- * Min or Max:
-
- * Min:
-
- * Max:
-
- * Limits:
-
- * Space:
-
- The END_CHROM item is used to specify which class the eval_chrom class will be
- derived from. This determines what type of values will be provided to the
- evaluation function. If you need to change the type of values that you
- originally specified you should use one of the following class names:
- bit_chrom, gray_chrom or float_chrom. The bit_chrom class should be used for
- values from the properties of the chromosome string or integer values using
- binary decoding. The gray_chrom class should be used for integer values using
- Gray decoding. And the float_chrom class should be used for interpreting the
- genes as floating point values.
-
- * END_CHROM: gray_chrom
-
- * GENES info in:
-
- */
-
-
-
-
-
- /*------------------------------------------------------------------------
-
- -- class eval_chrom (objective function)
-
- ------------------------------------------------------------------------*/
-
- The get_name function is used as a kind of sanity check. This function returns
- the name of the problem being used and is printed in an output file (the log
- file described in Chapter 4). This is useful to confirm that you are solving
- the problem you intended.
-
- char* eval_chrom::get_name(void){
-
- // get name - returns a string describing the evaluation function
-
- // and chromosome representation
-
- char name[60];
-
-
-
- strcpy(name, "Test Problem 1, ");
-
-
-
- strcat(name, END_CHROM::get_name());
-
- return( name );
-
- }
-
- The eval function is where you will insert the program code for calculating the
- fitness of each chromosome.
-
- void eval_chrom::eval(void){
-
- // apply evaluation function to chrom
-
- The decode function is used to decode the genes of the chromosome string into
- the appropriate type of values for your evaluation or fitness function. As
- noted in the comments in the file, you only need to call the decode function if
- the evaluation function requires floating point or integer parameters. If you
- are calculating the fitness value on some property of the chromosome string,
- you can comment out or remove the call to the decode function.
-
- // decode bit string first (only necessary if func. requires floating
-
- // point or integers)
-
- END_CHROM::decode();
-
-
-
- /*---- BEGIN MODIFY ----*/
-
- /* Insert code for evaluation function here. Final value MUST be
-
- * assigned to the variable "fitness".
-
- *
-
- * Summary of Variables
-
- * str - Chromosome Bit String
-
- * array of unsigned short int
-
- * LCHROM - size of "str" array
-
- * vect - Decoded Genes of Chromosome if decode function called
-
- * array of double
-
- * GENES - size of "vect" array
-
- *
-
- * CAUTION: str, LCHROM & GENES should be treated as read-only
-
- * variables. Assigning values to these variables may cause
-
- * unpredictable behavior. However, you may assign values to the
-
- * elements of the "vect" array if you want something other than
-
- * the decoded genes to be printed by the chromosome print function.
-
- */
-
-
-
- The above comments give a summary of the variable you have access to for
- calculating the fitness of the chromosome. As an example suppose we want to
- optimize the function {EMBED Equation |} and we have properly defined three
- genes (one for each parameter) in the genes file. The following is an example
- of how to code the fitness function. The decode function above will be called
- to place the decoded values of the genes in the "vect" array.
-
- double x, y, z;
-
-
-
- // vect[0] is the x parameter, vect[1] is the y parameter and vect[2] is the z
- parameter
-
-
-
- x = vect[0] * vect[0]; // x^2
-
- y = vect[1] * 3; // 3*y
-
- z = vect[2] - 2;
-
- z = z * z; // (z-2)^2
-
-
-
- fitness = x + y + z;
-
-
-
- /*---- END MODIFY ----*/
-
-
-
- // clear eval flag
-
- do_eval = 0;
-
- }
-
-
-
- After you have written the program code for the new problem you can apply a GA
- program to this problem by selecting it through the Select Problem option on
- the main menu.
-
-
-
- Chapter 4
-
- {tc "4. EXECUTING YOUR GA PROGRAM"}Executing YOUR GA Program
-
- This chapter describes how to execute the GA programs you create through the
- user interface program and the output files produced by the GA program.
-
- The GA program works with several input files and creates several output files.
- You can direct the GA program to used the default names for these files (i.e.,
- file with no suffix) or used the files with a specified suffix. To execute the
- GA program use the name you assigned to the program as the command and you may
- supply an optional parameter for the file suffix (i.e., % ga_program_name
- [suffix]). For example, if you selected the default name for your GA program
- (i.e., GA.EXE) and you want the program to read input files with the "test"
- suffix (i.e., in.test and genes.test) and create output files with this suffix
- use the following command.
-
-
-
-
-
- % GA.EXE test
-
-
-
- If you do not supply the optional command line parameter the GA program will
- look for the default input files (i.e., in and genes) and the program will use
- the default names for the output files created during the run (i.e., best, log,
- out, and pops). The file suffix options allows you to save the input and
- output information about different experiments without continually renaming the
- files used by the system.
-
-
-
- {tc "Description of Output Files" \L 2}Description of Output Files
-
-
-
- best The best file contains the specified number of the best chromosomes
- developed during each experiment. The number of chromosome printed in this
- file is controlled by the Best Size parameter in the in file. This file is
- only created if the Best Size parameter is greater than zero. This file
- contains the chromosome string, the value of each gene, the fitness value and
- the generation and trial when the chromosome was first created. The following
- is an example of this file.
-
-
-
- Experiment: 1
-
-
-
- 101101010101001110100101110011
-
- 2.13 -1.98 -1.41
-
- 0.5954 10 193
-
-
-
- The first line is the chromosome string. The second line show the values of
- the genes defined for this chromosome. The last line contains the fitness
- value (0.5954) and the generation (10) and trial (193) when this chromosome was
- created, respectively.
-
-
-
- log The log file contains general information about the GA experiment. It
- contains the date and time the experiment was conducted, most of the values of
- the input parameters and the GA components used for each phase of the GA
- process. The following is an example of a log file.
-
-
-
- Experiment Conducted: Sun May 3 13:29:10 1992
-
-
-
- Input Parameters:
-
- Total Experiments = 1
-
- Total Trials = 200
-
- Population Size = 20
-
- Chromosome Length = 30
-
- DeJong Gap = 1.0000
-
- Elite Gap = 1.0000
-
- Prob. Cross = 1.0000
-
- Prob. Mutation = 0.0010
-
- Selection Bias = 1.5000
-
- Replacement Bias = 1.5000
-
- Maximum Spin = 3
-
- Best Size = 1
-
- Worst Size = 3
-
- Maximize = 0
-
- Random Seed = 123456789
-
-
-
- Genetic Algorithm Components
-
- Chromosome: Shifted 3-D Parabola, Floating Pt, 1-D Bit Str.
-
- Reproduction: Generational
-
- Selection: GENESIS Rank
-
- Replacement: DeJong Gap w/Single Elitism
-
- Crossover: 1 PT Reduced Surrogate
-
- Mutation: Canonical Mutation
-
- Statistics: Basic Stats
-
-
-
- out The out file contains the experiment statistics that are printed after each
- Statistics Interval (input parameter in the in file). At each Statistics
- Interval the following information in printed.
-
-
-
- Current Generation - this is the number of generational cycles completed.
-
- Current Trial - this is the number of trials or chromosomes created so far.
-
- Lost Allele - this is the number of allele (positions on chromosome) where the
- entire population has the same value.
-
- Converged Allele - this is the number of allele where at least 80% of the
- population has the same value.
-
- Allele Bias - this is the average percentage of the most prominent value for
- each allele. For example, a bias of 0.70 means that, on average, each position
- has converged to 70% 0s or 70% 1s. The minimum bias is 0.5 when there is an
- equal distribution of 1s and 0s for each allele.
-
- On-line performance - this is the mean of all solutions tested to this point in
- the run. (DeJong 1975)
-
- Off-line performance - this is the mean of the best solutions developed at the
- end of each generation. (DeJong 1975)
-
- Best - this is the fitness value of the best solution in the current
- population.
-
- Average - this is the average fitness of the population at this point in the
- experiment.
-
-
-
- If the Hamming Metrics flag is turned on then the On-line and Off-line
- performance fields are replaced by the Clustering and Drift metrics,
- respectively.
-
-
-
- Clustering - this is the average Hamming distance between each pair of
- chromosomes in the population. A value near 1.0 indicates a lot of diversity
- in the population. As the clustering metric moves toward 0.0 the population is
- converging and losing diversity. (Frederick, Sedlmeyer and White 1991)
-
- Drift - this is the average Hamming distance between the best chromosome and
- all other chromosomes in the population. Similar to the clustering metric, a
- value near 1.0 indicates a lot of diversity and a value of 0.0 means the
- population has completely converged to a single chromosome. (Frederick,
- Sedlmeyer and White 1991)
-
-
-
-
-
- The following is an example of the out file.
-
-
-
- Experiment: 1
-
- Gen Trials Lost Conv Bias OnLine Offline Best Average
-
-
-
- 0 20 0 0 0.582 3.2093e+01 1.1283e+01 2.874900e+00 3.209285e+01
-
- 2 56 0 2 0.635 2.8387e+01 5.8203e+00 2.750900e+00 2.168309e+01
-
- 4 94 0 6 0.670 2.3457e+01 4.4480e+00 9.473000e-01 1.582167e+01
-
- 6 134 0 8 0.687 1.9270e+01 3.3912e+00 9.033000e-01 7.569095e+00
-
- 8 170 3 13 0.732 1.6394e+01 2.8505e+00 8.090000e-01 4.364355e+00
-
- 10 208 3 18 0.772 1.4202e+01 2.4559e+00 5.954000e-01 4.710895e+00
-
-
-
- pops The pops file contains a printed copy of all chromosomes in each
- generation. This file is only created if the Print Generation flag is turn on
- in the in file. Depending on the population size and the maximum number of
- trials this file can be very large. The following shows the format of this
- file.
-
-
-
- Experiment: 1
-
-
-
- GENERATION: 0
-
- 111110010010000101011000101011
-
- 4.84 0.21 0.43
-
- 16.8946 0 1
-
- 001101111100110001010011010010
-
- -2.89 -3.15 -3.02
-
- 27.6750 0 2
-
- 010101001000111011111110010101
-
- -1.74 -2.73 4.05
-
- 52.1030 0 3
-
-
-
- Each chromosome printed in the pops file has the same format as those printed
- in the best file. The first line contains the chromosome string. The second
- line displays the value of each gene defined for the chromosome. And the last
- line contains the fitness value and the generation and trials when the
- chromosome was created, respectively.
-
-
-
- CHAPTER 5
-
- {tc "5. PROGRAMMER'S REFERENCE"}Programmer's Reference
-
- {tc "Description of Classes" \L 2}Description of Classes
-
-
-
- Chromosome Classes
-
-
-
- base_chrom - This class is the root of the chromosome hierarchy of classes.
- The base_chrom class provides several data items that are common to all
- chromosomes regardless of their actual implementation, such as a variable for
- the fitness value. (Files chrom.h & chrom.C)
-
-
-
- bit_chrom - The bit_chrom class implements a one-dimensional binary string
- chromosome representation. This class is derived from base_chrom and provides
- services for the other GA phases to access and modify the chromosome structure.
- (Files chrom.h & chrom.C)
-
-
-
- gray_chrom - The gray_chrom class is derived from bit_chrom and overloads the
- decoding method so that the chromosome can be interpreted as Gray code (rather
- than binary). (Files chrom.h & chrom.C)
-
-
-
- float_chrom - The float_chrom class provides a decoding mechanism that allows
- the chromosome to be decoded as floating point values rather than integers.
- The decoding procedure first decodes the chromosome to integers and then maps
- the integer value to a floating point range. This class can be derived from
- either bit_chrom or gray_chrom so that the chromosome can be interpreted as
- binary or Gray code. (Files chrom.h & chrom.C)
-
-
-
- eval_chrom - The eval_chrom class implements the problem specific evaluation or
- fitness function. This class can be derived from bit_chrom, gray_chrom or
- float_chrom depending on what type of values the evaluation function requires
- (i.e., integers or real numbers). (File eval_chr.h & eval_chr.C)
-
-
-
- Reproduction Classes
-
-
-
- base_pop - The base_pop class is the root of the population hierarchy of
- classes. It implements data structures and procedures that are common to both
- generational and individual reproduction. For example, both style of
- reproduction need at least on population data structure and they share common
- initialization and termination procedures for the basic execution cycle of the
- GA process. (Files populat.h & populat.C)
-
-
-
- pop_gen - The pop_gen class is derived from base_pop and implements the basic
- execution cycle for generational reproduction. (Files pop_gen.h & pop_gen.C)
-
-
-
- pop_ind - The pop_ind class is derived from base_pop and implements the basic
- execution cycle for individual reproduction, where one or two offspring are
- created and immediately inserted into the population. (Files pop_ind.h &
- pop_ind.C)
-
-
-
- Selection Classes
-
-
-
- rank_sel - The rank_sel class implements Linear Rank selection and can be used
- with either generational or individual reproduction. (Files rank_sel.h &
- rank_sel.C)
-
-
-
- rank_sus_gen - The rank_sus_gen class implements GENESIS Rank selection for
- generational reproduction. (Files rk_sus_gen.h & rk_sus_gen.C)
-
-
-
- rank_sus_ind - This class implements GENESIS Rank selection for individual
- reproduction. (Files rk_sus_ind.h & rk_sus_ind.C)
-
-
-
- sus_gen - The sus_gen class implements SUS selection for generational
- reproduction. (Files sus_gen.h & sus_gen.C)
-
-
-
- sus_ind - This class implements SUS selection for individual reproduction.
- (Files sus_ind.h & sus_gen.C)
-
-
-
- Replacement Classes
-
-
-
- base_replace_g - The base_replace_g class is the root of the hierarchy for
- generational replacement policies. This class implements a shared procedure
- for storing new offspring until an entire generation has been created. (Files
- repl_gen.h & repl_gen.C)
-
-
-
- gen_gap_e1 - The gen_gap_e1 class is derived from the base_replace_g class and
- implements DeJong's Gap with Single Elitism replacement policy for generational
- reproduction. (Files repl_gen.h & repl_gen.C)
-
-
-
- gen_e_n - The gen_e_n class is derived from the base_replace_g class and
- implements the Variable Elitism replacement policy for generational
- reproduction. (Files repl_gen.h & repl_gen.C)
-
-
-
- killer - The killer class implements the Remove Worst (kill_worst procedure),
- Reverse Rank (rev_rank procedure) and Reverse Tournament (rev_tourn procedure)
- replacement policies for individual reproduction. (Files killer.h & killer.C)
-
-
-
- repl_rank - The repl_rank class implements tasks that Linear Rank selection,
- when used with individual reproduction, requires the replacement phase to
- accomplish (i.e., inserting new offspring in the proper order according to
- their rank). This class is derived from the killer class so that the
- replacement policy can be varied independent of the additional tasks that the
- selection technique requires. (Files repl_rank.h & repl_rank.C)
-
-
-
- repl_rk_sus - The repl_rk_sus class implements tasks that GENESIS Rank
- selection, for individual reproduction (rank_sus_ind), requires the replacement
- policy to accomplish. This class is derived from the killer class. The
- repl_rk_sus class also implements the Generational Emulation replacement policy
- (the kill_term procedure). (Files repl_rk_sus.h & repl_rk_sus.C)
-
-
-
- repl_sus - The repl_sus class implements tasks that SUS selection, for
- individual reproduction (sus_ind), requires the replacement phase to
- accomplish. This class is derived from the killer class. The repl_sus class
- also implements the Generational Emulation replacement policy (the kill_term
- procedure). (Files repl_sus.h & repl_sus.C)
-
-
-
- Crossover Classes
-
-
-
- n_pt - The n_pt class implements Traditional crossover where the cut points are
- randomly selected. (Files n_pt.h & n_pt.C)
-
-
-
- reduced_sur - The reduced_sur class implements the Reduced Surrogate crossover
- technique. (Files reduced.h & reduced.C)
-
-
-
- one_pt - The one_pt class can be derived from either n_pt or reduced_sur to
- provide a one point crossover procedure for either Traditional or Reduced
- Surrogate crossover, respectively. (Files cross.h & cross.C)
-
-
-
- two_pt - The two_pt class similar to the one_pt class can be derive from either
- n_pt or reduced_sur for a crossover technique where two cut points are
- selected. (Files cross.h & cross.C)
-
-
-
- uniform_xo - The uniform_xo class implements Uniform crossover. This class is
- derived from n_pt which provides the procedure for determining when crossover
- should be performed according to the crossover probability. But the uniform_xo
- class overloads the procedure for selecting the actual cut points. (Files
- unixo.h & unixo.C)
-
-
-
- Mutation Class
-
-
-
- can_mute - The can_mute class implements the Canonical mutation technique.
- (Files can_mute.h & can_mute.C)
-
-
-
- Statistics Classes
-
-
-
- basic_stat - The basic_stat class is used to collect and report various
- statistics about the GA experiment. (Files stat.h & stat.C)
-
-
-
- extreme_perfs - The extreme_perfs class is used by the basic_stat class for
- monitoring the best and worst performances or solutions developed by the GA.
- The best individuals may be printed to a file at the end of each experiment.
- The worst solutions developed are used as a scaling mechanism with
- fitness-based selection techniques (e.g., SUS). (Files stat.h & stat.C)
-
- {tc "Description of #define Directives" \L 2}Description of #define Directives
-
- With the GA Toolkit the user is easily able to exchange the components used for
- each phase of the GA process. To exchange components without manually
- modifying the program code, #define preprocessor directives of the C++ language
- were used where a particular class name needed to specified in the source code.
- Different techniques are easily exchange by changing the class name that is
- associated with the appropriate #define identifier. For example, the SELECT
- identifier specifies which selection technique the user desires. So when we
- needed to create an instance of the a selection class in the source code, we
- used the SELECT identifier name rather than the actual class name. Then we can
- easily change which selection technique is used by associating a different
- class name with the SELECT identifier. For example,
-
-
-
-
-
- #define SELECT sus_ind // SUS selection for individual reproduction
-
-
-
- // create instance of a selection class
-
- SELECT select_obj;
-
-
-
- // rather than
-
- sus_ind select_obj;
-
-
-
- The following describes the purpose of each #define identifier and the class
- names that may be associated with each identifier.
-
-
-
-
-
- #define
-
- Identifier
-
-
-
- Valid Class
-
- Names
-
-
-
- Description
-
-
-
- REP_CHROM
-
- bit_chrom
-
- The REP_CHROM identifier specifies the representation chromosome class.
- (Located in file def_rep.h)
-
-
-
-
-
- CODE_CHROM
-
- bit_chrom
-
- gray_chrom
-
- The CODE_CHROM identifier specifies the class that implements the desired
- method of chromosome decoding . (Located in file def_rep.h)
-
-
-
-
-
- END_CHROM
-
- bit_chrom
-
- gray_chrom
-
- float_chrom
-
-
-
- The END_CHROM identifier specifies the class that the eval_chrom class will be
- derived from. (Located in file def_rep2.h)
-
-
-
-
-
- CHROM
-
- eval_chrom
-
- The CHROM identifier specifies the complete chromosome class that all the other
- phases manipulate (e.g., selection and crossover). (Located in file def_chr.h)
-
-
-
-
-
- POP_TYPE
-
- pop_gen
-
- pop_ind
-
- The POP_TYPE identifier specifies the style of reproduction (generational or
- individual) that will be used. (Located in file def_pop.h)
-
-
-
-
-
- SELECT
-
- rank_sel
-
- rank_sus_gen
-
- rank_sus_ind
-
- sus_gen
-
- sus_ind
-
-
-
- The SELECT identifier specifies the selection technique to use. (Located in
- file def_phase.h)
-
-
-
-
-
-
-
-
-
-
-
- REPLACE
-
- gen_gap_e1
-
- gen_e_n
-
- repl_rank
-
- repl_rk_sus
-
- repl_sus
-
- The REPLACE identifier specifies the replacement class to use. (Located in
- file def_phase.h)
-
-
-
-
-
-
-
-
-
-
-
-
-
- KILLER
-
- kill_worst
-
- rev_rank
-
- rev_tourn
-
- kill_term
-
- The KILLER identifier specifies the replacement policy when using individual
- reproduction. (Located in file def_kill.h)
-
-
-
-
-
-
-
-
-
- CROSS_TYPE
-
- n_pt
-
- reduced_sur
-
- The CROSS_TYPE identifier specifies which class to use for selecting the
- desired number of cut points. This is the class where one_pt or two_pt will be
- derived from. (Located in file def_xo.h)
-
-
-
-
-
- CROSS
-
- one_pt
-
- two_pt
-
- uniform_xo
-
- The CROSS identifier specifies the crossover technique to use. (Located in
- file def_phase.h)
-
-
-
-
-
-
-
-
-
- MUTE
-
- can_mute
-
- The MUTE identifier specifies the mutation technique to use. (Located in file
- def_phase.h)
-
-
-
-
-
- STATS
-
- basic_stat
-
- The STATS identifier specifies the class to use for collecting and reporting
- experiment statistics. (Located in file def_phase.h)
-
-
-
-
-
- {tc "Adding a New GA Component" \L 2}Adding a New GA Component
-
- This section describes how to add a new GA component to the Toolkit's library.
- This process requires that you create the program code for your new component
- and modify several of the existing source code files. The following describes
- the necessary steps and illustrates this process by showing how the uniform
- crossover technique was added to the Toolkit.
-
- 1. Create Program Code for New Component
-
- The first step is to write the program code for your new component. The major
- rule you must follow is that the class definition of your new component must
- have the same public interface as the classes already defined in the Toolkit
- for that phase. This is so that you can easily exchange the GA components.
- The header files and comments in the source code describe the interfaces for
- each phase. The source code files for your component should be placed in the
- Source sub-directory where the Toolkit was installed. The more familiar you
- are with the source code and the #define identifiers the easier it will be to
- develop the code for your new technique.
-
- As an example, the following describes the modifications that were necessary to
- incorporate the uniform crossover technique. The first step was to look at the
- public interface for the crossover phase. In the crossover hierarchy there
- were originally four classes: one_pt, two_pt, n_pt, and reduced_sur. The
- one_pt and two_pt classes implement crossover with the corresponding number of
- cut points. These classes can be derived from either n_pt or reduced_sur so
- that the desired number of cut point(s) can be randomly selected (n_pt) or
- selected by the Reduced Surrogate method (class reduced_sur, which attempts to
- create offspring that differ from the parents whenever possible). The other
- phases of the GA will interact with an instance of the one_pt or two_pt class.
- So we should first look at the public interface for these classes. The class
- definition for one_pt is shown below (this is in the cross.h file).
-
-
-
- class one_pt : public CROSS_TYPE {
-
- public:
-
- one_pt(void) : CROSS_TYPE(1){}
-
- char* get_name(void);
-
- };
-
- The use of the CROSS_TYPE identifier allows us change which class (n_pt or
- reduced_sur) one_pt will be derived from. The public interface contains a
- constructor and the get_name method. The constructor calls the parent class'
- constructor with a parameter for the number of cut points (i.e., 1). The
- get_name method returns the name of the component for printing in the log file.
- Now we need to look at the parent class for methods that it inherits. The
- following shows the class definition for the n_pt class (located in the n_pt.h
- file).
-
-
-
- class n_pt {
-
- public:
-
- n_pt(int);
-
- ~n_pt(void) { delete cuts; }
-
- char* get_name(void);
-
- void cross(CHROM*, CHROM*, CHROM*, CHROM*);
-
- protected:
-
- int *cuts; //array of cut positions
-
- int no_cuts; //number of cuts (n)
-
- int total_cross; //total crosses to perform
-
- int cur_cross; //crosses counter
-
-
-
- virtual void pick_cuts(void); //pick cut point(s)
-
- };
-
-
-
- The cross method is the main procedure for implementing the crossover
- technique. The cross method determines whether to perform crossover based on
- the crossover probability. If crossover should be performed the pick_cuts
- method is called to randomly select the desired number of cut points. Then the
- cross method will pass this list of cut point(s) to the chromosome object to
- actually exchange the desired segments from the parents to the offspring.
-
- The class we created for uniform crossover needed a similar cross method but we
- needed to pick the cut points differently. So we created a new class derived
- from the n_pt class and overloaded the pick_cuts method. We also needed to
- overload the get_name method so that it would return a name for the uniform
- crossover method. The following shows the class definition and implementation
- of the uniform_xo class (located in the unixo.h and unixo.C files).
-
-
-
- class uniform_xo : public n_pt {
-
- public:
-
- uniform_xo(void) : n_pt(LCHROM-1){}
-
- char* get_name(void);
-
- protected:
-
- void pick_cuts(void);
-
- };
-
-
-
- char* uniform_xo::get_name(void){
-
- return( "Uniform Crossover" );
-
- }
-
-
-
- void uniform_xo::pick_cuts(void){
-
- int i;
-
- int cur_toss, prev_toss;
-
-
-
- prev_toss = flip(SWAP_PROB);
-
- no_cuts = 0;
-
- for( i=1; i < LCHROM; i++ ){
-
- cur_toss = flip(SWAP_PROB);
-
- if(cur_toss != prev_toss){
-
- cuts[no_cuts] = i;
-
- no_cuts++;
-
- prev_toss = cur_toss;
-
- }
-
- }
-
- }
-
- The constructor calls the n_pt constructor with the maximum number of cut
- points allowed so that the data structures will be initialized correctly. The
- get_name method just returns the name of the technique. And the pick_cuts
- method determines whether each position on the chromosome string should be
- swapped, according to the swapping probability. If the previous position
- should be swapped and the current position should not (and vice versa) then
- this is noted as a cut point. After the pick_cuts methods develops of the list
- cut points the cross procedure (inherited from n_pt) will call a chromosome
- method to actually exchange the segments between the parents and the offspring.
-
- 2. Update the Appropriate #define Identifier
-
- After you create the program code for your new method, you will need to update
- the file that contains the appropriate #define identifier. The Description of
- #define Directives section above describes the purpose of each identifier and
- the file where it is located. You will need to add a line to the file that
- contains the appropriate identifier for the type of component you are adding.
- A little later we will get more specific about the necessary modification, but
- first you need to decide which #define identifier is appropriate for your new
- component.
-
- Returning to the uniform crossover example we have two possible #define
- identifiers the might apply: CROSS_TYPE and CROSS. CROSS_TYPE is used to
- specify which class the one_pt and two_pt classes will be derived from. The
- CROSS identifier specifies which class is used for the crossover phase. Since
- we have already specified that the uniform_xo class will be derived from the
- n_pt class the CROSS_TYPE identifier is not applicable in this case. Moreover,
- the uniform_xo class implements all the public interface features of a
- technique for the crossover phase and thus it should be under the CROSS
- identifier category.
-
- The following shows part of the def_phase.h file, that contains the CROSS
- #define identifier, and discusses the modifications that are require to update
- this file.
-
-
-
- // -- crossover
-
- //#define CROSS uniform_xo
-
- #define CROSS one_pt
-
- //#define CROSS two_pt
-
-
-
- // -- select
-
- // ****** gen style
-
- //#define SELECT sus_gen
-
- //#define SELECT rank_sel
-
- #define SELECT rank_sus_gen
-
- // ****** individual style
-
- //#define SELECT sus_ind
-
- //#define SELECT rank_sus_ind
-
-
-
- In the file that contains the appropriate #define identifier you will need to
- add a line containing the #define preprocessor directive with the identifier
- name and the class name for your new component. For example, "//#define CROSS
- uniform_xo". The new line must be preceded by the comment symbol "//" (the new
- style of C++ comments). This is necessary because the user interface program
- changes which components are used by commenting out the #define directive of
- the active component and then removing the comment symbol ("//") from the
- #define directive for the component you selected. As you can see from the
- section of the def_phase.h file shown above, only one definition for each
- identifier will be active (i.e., not commented out) at any one time.
-
- There are several reasons for listing all the appropriate values for each
- identifier and keeping only one active by removing the comment symbol. First,
- this is a convenient method to document the legal values for each identifier.
- Second, this is an addition level of security to prevent inappropriate values
- from being assigned to a #define identifier since the system must find an exact
- match before it will remove the comment symbol.
-
- 3. Update Header Files
-
- The next step is to place an #include preprocessor directive for the header
- file of your new class in one of several composite header files. The composite
- header files are files the contain nothing but #include directives for other
- header files. The use of these composite header files simplifies the process
- of making the appropriate section of the program code aware of and be able to
- use your new component. So instead of having locate all the places in the
- source code that need access to your new header file, you only need to add an
- #include directive in one of the composite header files. And the composite
- header files are already included in the appropriate places in the existing
- source code.
-
- Which composite header file you need to modify will depend on type of component
- you are adding to the system. The following describes the type of components
- that should be included in the different composite header files.
-
-
-
- hdr_chr.h - this file includes header files for classes in the chromosome
- hierarchy.
-
-
-
- hdr_xo.h - this file includes header files for classes in the crossover
- hierarchy.
-
-
-
- hdr_gen.h - this file includes header files for components that can only be
- used with generational reproduction GAs and that were not included in the
- previous composite header files (i.e., hdr_chr.h or hdr_xo.h).
-
-
-
- hdr_ind.h - this file includes header files for components that can only be
- used in individual reproduction GAs and that were not included in the previous
- composite header files.
-
-
-
- hdr_both.h - this file includes header files for components that can be used
- with either generational or individual reproduction GAs and that were not
- included in the previous composite header files.
-
- For example, we added the following line to the hdr_xo.h file for the uniform
- crossover technique.
-
-
-
- #include "unixo.h"
-
-
-
- NOTE: You only need to add the #include directive for the header file of your
- new component to one of the listed composite header files.
-
- 4. Update makefile
-
- At this point you have made all the necessary modification to the source code.
- Now its time to update the makefile so the make utility will be able to compile
- your new component and link it with other modules to create an executable GA
- program. There are several modifications you will need to make to the makefile
- located in the Source sub-directory.
-
- First, near the top of the makefile there are several variables defined with a
- list of object files, you will need to add the object file for your new
- component to one of these variables. The BOTH variable contains object files
- that can be used with either generational or individuals reproduction GAs. The
- GEN variable contains object files for components that can only be used with
- generational reproduction GAs. And the IND variable contains object files for
- components that can only be used with individual reproduction GAs. The
- following shows the this portion of the makefile.
-
-
-
- # object files for components that can be used with either
-
- # generational or individual reproduction
-
-
-
- BOTH = main.o chrom.o eval_chr.o cross.o n_pt.o rank_sel.o reduced.o \
-
- can_mute.o stat.o populat.o unixo.o
-
-
-
- # object files needed for Generational Reproduction GAs
-
-
-
- GEN = repl_gen.o sus_gen.o rk_sus_gen.o pop_gen.o
-
-
-
- # object files needed for Individual Reproduction GAs
-
-
-
- IND = killer.o repl_rank.o repl_sus.o sus_ind.o repl_rk_sus.o \
-
- rk_sus_ind.o pop_ind.o
-
-
-
- The object file for uniform crossover (unixo.o) was added to the BOTH variable
- since it can be used with either type of GA.
-
- The second modification to the makefile is to add the dependency information
- for the object file of your component and add the commands for updating the
- object file if the dependents have been modified since the object file was
- created. The following shows the section of the makefile relating to the
- crossover object files.
-
-
-
- #----- Crossover Classes - implement crossover techniques
-
- #
-
- # cross.* implements 1 and 2 point crossover
-
- # n_pt.* implements Traditional crossover (where cut points
-
- # randomly selected).
-
- # reduced.* implements Reduced Surrogate crossover
-
- # unixo.* implements Uniform crossover
-
-
-
- cross.o : cross.C hdr_xo.h extern.h
-
- CC +i -c -g cross.C
-
- rm *..c
-
-
-
- n_pt.o : n_pt.C n_pt.h $(HDRS)
-
- CC +i -c -g n_pt.C
-
- rm *..c
-
-
-
- reduced.o : reduced.C reduced.h $(HDRS)
-
- CC +i -c -g reduced.C
-
- rm *..c
-
-
-
- unixo.o : unixo.C hdr_xo.h extern.h
-
- CC +i -c -g unixo.C
-
- rm *..c
-
- For example, the uniform crossover object file is depended on several header
- files and the source code file that implements the methods defined for this
- class. The lines below the dependency information are the commands necessary
- to update date the object file.
-
- And the final modification to the makefile is to add a new dependent for the
- composite header file where you added the #include directive for your
- component's header file in Step 3 above. The following shows the section of
- the makefile for the hdr_xo.h file where we added the #include directive for
- the header file of the uniform crossover technique.
-
-
-
- hdr_xo.h : n_pt.h reduced.h cross.h unixo.h
-
- touch hdr_xo.h
-
-
-
- Here we only need to modify the time and date of the file (by the touch
- command) to update the composite header file. This causes the object files
- that have this file as a dependent to be updated since the hdr_xo.h file is now
- more currently than the object files.
-
- 5. Create a Help File
-
- Now you need to create a help file for your new component. The information in
- this file will be displayed when the user selects your component and presses
- the F1 key while in the Select GA Components section of the user interface
- program.
-
- All of the help files are located in the Help sub-directory under the main
- directory where the GA Toolkit was installed. You will need to go to this
- directory to create the help file. In the Help directory there is a file
- called HELP.TEMP. This provides a template for creating your help file. To
- create your help file you will need to make a copy the HELP.TEMP file. Then
- you will need to edit your new file with a text editor and add the information
- about your new component.
-
- 6. Update Initialization File for the User Interface Program
-
- Now comes the fun part! For you to be able to choose your new component
- through the user interface program you need to update an initialization file
- the system reads about the available components. By placing this information
- in an input file, rather than hard-coding it, it is easier to update the user
- interface program when new components are added. This initialization file
- (called gat.ini in the Uif sub-directory) contains information about how to
- display the components and the constraints on which component can be used
- together.
-
- The first line of the file contains a number for the number of entries in the
- file. The user interface program uses an array data structure to store and
- manipulate the information about each component. Knowing the number of entries
- in the file makes it easier to create this data structure. You will need to
- increment this number for each entry you add to this file.
-
- The entries about each components contain several fields. Each field, with the
- exception of Constraints, must be delimited by commas. The following describes
- each field. An actual portion of the gat.ini file will be presented and
- explained later.
-
-
-
- Menu Category - this is the name to display on the menu of components (e.g.,
- Chromosome or Selection).
-
-
-
- Sub-Menu - this indicates whether there is a sub-menu for the category and the
- name of the sub-category (e.g., Representation and Decoding are sub-categories
- for Chromosome). Use "None" in this field if there is no sub-category.
-
-
-
- Name of Component - this is name that will be displayed for the component
- (e.g., GENESIS Rank and Linear Rank components in the Selection category).
-
-
-
- #define Identifier - this is the #define identifier associated with the
- component (e.g., CROSS is the #define identifier for Uniform Crossover).
-
-
-
- Value for #define - this the value to associate with the #define identifier
- when the component is selected (e.g., we would give CROSS the value uniform_xo
- when Uniform Crossover is selected).
-
-
-
- Help file - this is the name of the file (in the Help sub-directory) that
- contains information about the component to displayed when the user presses the
- F1 key.
-
-
-
- Number of Constraints - this is the number of lines that follow describing the
- constraints about which components can be used with this technique.
-
-
-
- Constraints - each constraint is on a separate line following the entry for the
- component. Each constraint line must begin with a tab character. Constraints
- have the following format:
-
-
-
- identifier operator value
-
-
-
- The system currently understands the four operators described below. Examples
- of these operators will be discussed later.
-
-
-
- = This is the equal operator. It means that the current value associated with
- the #define identifier must be equal to the value specified in the constrain
- for this component to be compatible with the current configuration.
-
-
-
- != This is the not equal operator. It means that as long as the current value
- associated with the #define identifier is not equal to the value specified in
- the constraint then this component is compatible with the current
- configuration.
-
-
-
- > This is the implies operator. It means that if this component is selected
- and all the other constraints are satisfied then the component designated by
- the #define identifier and value of the constraint must also be selected. For
- example, if you are using SUS selection with individual reproduction, the
- repl_sus class must be used in the replacement phase.
-
-
-
- ?= This is the if operator. It means that if the identifier is currently
- associated with the value in the constraint then evaluate the constraints in
- the following lines preceded by two tab characters.
-
- The follow shows portions of the gat.ini file. The actual contents of the file
- are print in a Courier font and additional comments explaining the entries are
- printed in a Times Roman font.
-
-
-
- The first three entries in the file are for the chromosome category. The first
- entry is for the Representation sub-category. The name of the component is 1-D
- Bit String (the third field). When this component is selected the system will
- assign the REP_CHROM identifier the value of bit_chrom (the fourth and fifth
- fields). The bit_str.hlp file in the Help sub-directory contains additional
- information the user may access via the F1 key. And the last field contains a
- zero indicating that there are no constraints for this component.
-
- 24
-
- Chromosome,Representation,1-D Bit String,REP_CHROM,bit_chrom,bit_str.hlp,0
-
- Chromosome,Decoding,Binary,CODE_CHROM,bit_chrom,binary.hlp,0
-
- Chromosome,Decoding,Gray,CODE_CHROM,gray_chrom,gray.hlp,0
-
- The entry below is for the Individual Reproduction. This is probably the most
- complicated entry because of the various constraints. But this is a good
- example to explain how the constraints work. All of the constraints in this
- case use the if operator. The system will evaluate each constraint with the if
- operator and if the condition is true it will evaluate the constraints below it
- that are precede by two tab characters. So the first two constraints mean that
- when the user selects individual reproduction if we are currently using the
- generational version of SUS (sus_gen) or GENESIS Rank (rank_sus_gen),
- respectively, that we should switch to the individual reproduction version of
- the selection technique. The third constraint has two if conditions. It means
- the if Linear Rank selection (rank_sel) and the Generational Emulation
- replacement policy (kill_term) are currently selected the we need to change the
- replacement policy to the Remove Worst technique (kill_worst) since
- Generational Emulation is not compatible with Linear Rank selection. And the
- final constraint is a little confusing. It says the if Linear Rank selection
- is being used then we need to change the selection component to Linear Rank.
- Now this seem silly. But the implies constraint (>) actually causes the
- program to evaluate the constraints for the component we what to change. So
- the program will evaluate the constraints for Linear Rank selection. One of
- these constraints is that if individual reproduction is being used then we also
- need the repl_rank component for the replacement phase which inserts new
- offspring into the population it the proper order according to their rank.
-
- Reproduction,None,Individual,POP_TYPE,pop_ind,ind.hlp,9,
-
- SELECT ?= sus_gen
-
- SELECT > sus_ind
-
- SELECT ?= rank_sus_gen
-
- SELECT > rank_sus_ind
-
- SELECT ?= rank_sel
-
- KILLER ?= kill_term
-
- KILLER > kill_worst
-
- SELECT ?= rank_sel
-
- SELECT > rank_sel
-
- The entries below are most of the entries for the Selection category. The
- first entry is for the generational version of SUS selection. The only
- constraint on this component is that the style for reproduction must be
- generational (pop_gen). The last entry is for Linear Rank selection with
- individual reproduction. The constraint says we must be using individual
- reproduction for this component to be compatible. The second constraint says
- the we can use Linear Rank selection as long as we are not using the
- Generational Emulation replacement policy (kill_term). And the final
- constraint says the to use this component we must also use the repl_rank class
- for the replacement phase so that the offspring will be inserted in the proper
- position within the population, according to their rank.
-
-
-
- Selection,None,SUS,SELECT,sus_gen,sus.hlp,1,
-
- POP_TYPE = pop_gen
-
- Selection,None,SUS,SELECT,sus_ind,sus.hlp,2,
-
- POP_TYPE = pop_ind
-
- REPLACE > repl_sus
-
- Selection,None,Linear Rank,SELECT,rank_sel,lrank.hlp,1,
-
- POP_TYPE = pop_gen
-
- Selection,None,Linear Rank,SELECT,rank_sel,lrank.hlp,3,
-
- POP_TYPE = pop_ind
-
- KILLER != kill_term
-
- REPLACE > repl_rank
-
- The entries below are for the Crossover category. The first entry is the item
- we added for the Uniform crossover technique. In addition, we needed to add
- several constraints to the next two entries (Traditional and Reduced Surrogate
- crossover, respectively). These constraints mean the if we select either
- Traditional or Reduced Surrogate crossover when Uniform crossover is the
- component that we are currently using, then we need to change the CROSS
- identifier to one_pt. That is, it is invalid for the CROSS_TYPE identifier to
- be associated with either n_pt (Traditional) or reduced_sur (Reduced Surrogate)
- when the CROSS identifier is set to uniform_xo (Uniform crossover). Since the
- user has selected either Traditional or Reduced Surrogate crossover we need to
- change the CROSS identifier to either one_pt or two_pt and by default the
- system will use one_pt.
-
- Crossover,Style,Uniform XO,CROSS,uniform_xo,unif_xo.hlp,0
-
- Crossover,Style,Traditional,CROSS_TYPE,n_pt,trad_xo.hlp,2,
-
- CROSS ?= uniform_xo
-
- CROSS > one_pt
-
- Crossover,Style,Reduced Surrogate,CROSS_TYPE,reduced_sur,surr_xo.hlp,2,
-
- CROSS ?= uniform_xo
-
- CROSS > one_pt
-
- Crossover,# Cut Points,One,CROSS,one_pt,none,0
-
- Crossover,# Cut Points,Two,CROSS,two_pt,none,0
-
- So to be able to select your new component through the user interface program
- you will need to add an entry to the gat.ini file. When you add your entry you
- should place it in the file with the other entries for that type of technique.
- For example if your created a new selection technique you should add the new
- entry in the area of the gat.ini file the has other entries for the selection
- category. In addition, you may need to modify the constraints for some the
- existing entries to ensure compatibility with your new component. And finally,
- do not forget to update the number on the first line of gat.ini file to
- accurately represent the number of component entries in the file.
-
- After you have modified the gat.ini file you can execute the user interface
- program and you should be allow to select your new component. There is no need
- to re-compile the user interface program since gat.ini is only an input file.
-
- {tc "Adding a New Parameter" \L 2}Adding a New Parameter
-
- This section describes how you can add a new input parameter for the GA
- programs that you create with the GA Toolkit. This is especially useful if you
- create a new component that needs a new parameter. For example, for the
- Uniform crossover technique that was recently added to the system needed a
- parameter to specify the probability for swapping each position on the
- chromosome. The following describes the process for adding a new parameter.
-
- 1. Declaring a Variable for the New Parameter
-
- The first step is to declare a new variable for your parameter. You will need
- to modify two files to declare your new variable and to provide the component
- access to your parameter. All user-defined input parameters for the GA Toolkit
- are declared as global variables so that all component will have access to the
- parameters they require.
-
- The global.h file in the Source sub-directory contains all the declaration of
- global variables. You will need to add a declaration for your new variable in
- this file. For example, we added the following for the swapping probability
- parameter.
-
-
-
- double SWAP_PROB;
-
-
-
- The second file you need to modify is extern.h (in the Source sub-directory).
- This file contains all the external declaration for the global variables. This
- is necessary so that all the components will have access to the proper
- variables. For example, we added the following external declaration for the
- swapping probability parameter.
-
-
-
- extern double SWAP_PROB;
-
-
-
- 2. Update the Input Section
-
- In this step you need to modify the input.h file (in the Source sub-directory).
- This file contains several #define directives used for reading the input
- parameters from the in file. The INPUT_FORM identifier is used for the fscanf
- format string. And the INPUT_VARS identifier is used as the list of input
- parameter variables in the fscanf call. So by updating the #define identifies
- in the input.h file the program will be able to read in your new parameter.
- The following shows a section of the input.h file where we updated the
- INPUT_FORM and INPUT_VARS #define identifiers.
-
-
-
- #define INPUT_FORM " \
-
- Total Experiments = %d \
-
- Total Trials = %d \
-
- Population Size = %d \
-
- DeJong Gap = %lf \
-
- Elite Gap = %lf \
-
- Prob. Cross = %lf \
-
- Prob. Mutation = %lf \
-
- Selection Bias = %lf \
-
- Replacement Bias = %lf \
-
- Maximum Spin = %d \
-
- Statistics Interval = %d \
-
- Hamming Metrics = %d \
-
- Best Size = %d \
-
- Worst Size = %d \
-
- Maximize = %d \
-
- Evaluate All = %d \
-
- Seed Population = %d \
-
- Debugging Level = %d \
-
- Display = %d \
-
- Print Generations = %d \
-
- Random Seed = %lu \
-
- Swapping Prob. = %lf"
-
-
-
-
-
- #define INPUT_VARS &MAX_EXPER, &MAX_TRIALS, &POP_SIZE, &GAP, \
-
- &ELITE_GAP, &P_CROSS, &P_MUTE, &SELECT_BIAS, &REPLACE_BIAS, \
-
- &MAX_SPIN, &STATS_GAP, &HAM_FLAG, &BEST_SIZE, &WORST_SIZE, &MAX_FLAG, \
-
- &EVAL_ALL, &INIT_FLAG, &DEBUG_FLAG, &DISPLAY_FLAG, &PRINT_GEN, &SEED, \
-
- &SWAP_PROB
-
-
-
- As you can see we just added our new "Swapping Prob." parameter to the end of
- the INPUT_FORM identifier and the variable SWAP_PROB to the end of the
- INPUT_VARS identifier. Note: The INPUT_VARS identifier is used as the list of
- argument to the fscanf function so the "&" in front of the variable name is
- necessary so the value will be actually be store in the variable after the
- fscanf function returns.
-
- 3. Update the Help File for Input Parameters
-
- The next step is to add a short description about your new parameter to the
- in.hlp file (in the Help sub-directory). This file contains information about
- all the input parameters. You should use a text editor to edit the in.hlp file
- and add some information about your new parameter.
-
- 4. Update the Parameter Initialization File for the User Interface Program
-
- The final step for adding a new parameter is to update an initialization file,
- called gat_in.ini in the Uif sub-directory, so that you can create input files
- with this new parameter via the user interface program. The gat_in.ini file
- contains information about the parameters so the user interface program can
- ensure that only appropriate values are entered for each parameter.
-
- The first line in the gat_in.ini file contains the number parameter entries in
- the file. The user interface program uses an array data structure for storing
- and manipulation the parameter information. The number on the first line of
- the file makes it easier to create this data structure. When you add
- information to this file for new parameters you will need to update this number
- to accurately reflect the number of entries in the file.
-
- Each remaining lines in the file contains information about one parameter.
- Each line or entry contains several comma delimited fields described below.
- The actual gat_in.ini file will be presented and explained later.
-
-
-
- Parameter Name - this is a name for the parameter. You must use exactly the
- same name you used in the INPUT_FORM identifier in the input.h file. For
- example, for the swapping probability parameter we used the name "Swapping
- Prob." so that is the value we must use for this field.
-
-
-
- Default Value - this the default value for the parameter that will initially be
- displayed in the user interface program where the user begins creating the in
- file.
-
-
-
- Type of Parameter - this field is used so that the validity of the user's input
- can be checked. The system currently understands the following entries for
- this field.
-
-
-
- 1 the value for this parameter must be greater than the value of the Minimum
- field.
-
-
-
- 2 the value for this parameter must be greater than or equal to the value of
- the Minimum field.
-
-
-
- 3 the value for this parameter must be an even integer.
-
-
-
- 4 the value for this parameter must be in the range [Minimum field, Maximum
- field].
-
-
-
- 5 the value for this parameter must be in the range (Minimum field, Maximum
- field].
-
-
-
- 6 the value for this parameter can be 0 or 1. This is for boolean flags. A
- value of 0 is displayed as "no" and a 1 is displayed as "yes" in the user
- interface program. The default value for this field must be 1 or 0 (not yes or
- no).
-
-
-
- Minimum - this field is only used with parameters of type 1, 2, 4 or 5.
-
-
-
- Maximum - this field is only used with parameters of type 4 or 5.
-
-
-
- The following shows the actual gat_in.ini file and explains several of the
- entries. The contents of the file is printed in a Courier font and additional
- comments are printed in a Times Roman font.
-
-
-
- 22
-
- Total Experiments,1,1,0
-
- The Total Trials parameter has a default value of 1000. This is a type 1
- parameter and the Minimum field is 0 which means that the value for this
- parameter must be greater than 0.
-
- Total Trials,1000,1,0
-
- Population Size,50,3
-
- DeJong Gap,1.0,4,0.0,1.0
-
- Elite Gap,1.0,4,0.0,1.0
-
- The Prob. Cross entry is for the cross probability parameter. It has a default
- value of 1.0. This is a type 4 parameter with a minimum of 0.0 and maximum of
- 1.0. So the values for this parameter must be in the following range [0.0,
- 1.0].
-
- Prob. Cross,1.0,4,0.0,1.0
-
- Prob. Mutation,0.001,4,0.0,1.0
-
- Selection Bias,1.5,5,1.0,2.0
-
- Replacement Bias,1.5,5,1.0,2.0
-
- Maximum Spin,3,1,0
-
- Statistics Interval,200,1,0
-
- Hamming Metrics,0,6
-
- Best Size,1,2,0
-
- Worst Size,3,2,0
-
- The Maximize parameter is boolean flag (type 6) parameter for whether the GA
- will be applied to a maximization or minimization problem. The default value
- is 1 (the flag is on) indicating a maximization problem.
-
- Maximize,1,6
-
- Evaluate All,0,6
-
- Seed Population,0,6
-
- Debugging Level,0,2,0
-
- Display,0,6
-
- Print Generations,0,6
-
- Random Seed,123456789,2,0
-
- The Swapping Prob. entry is what we added for the swapping probability
- parameter for the Uniform crossover technique. The default value is 0.5 or on
- average half the positions on the chromosome will be swapped. This a type 4
- parameter with a minimum of 0.0 and a maximum of 1.0. So this parameter can be
- assigned values in the following range [0.0, 1.0].
-
- Swapping Prob.,0.5,4,0.0,1.0
-
-
-
- When you add an entry for your new parameter it must be in the same relative
- position as you added the information in the input.h file. For example, we
- added the Swapping Prob. parameter to the end of the INPUT_FORM and INPUT_VARS
- #define identifiers in the input.h file so we must also add our entry for this
- parameter to the end of the list in the gat_in.ini file. You can arrange the
- parameters in
-
- any order you like also long as you maintain the same order in the gat_in.ini
- file and in the INPUT_FORM #define identifier (with corresponding variables in
- the INPUT_VARS #define identifier).
-
- After you have completed the necessary modification to the gat_in.ini file you
- can execute the user interface program and create an in input file with your
- new parameter.
-
- {tc "REFERENCES" \l 1}References
-
- Ackley, D. H. A Connectionist Machine for Genetic Hillclimbing. Boston, MA:
- Kluwer Academic Publishers, 1987.
-
- Baker, J. E. "Adaptive Selection Methods for Genetic Algorithms." In
- Proceedings of the First International Conference on Genetic Algorithms and
- Their Applications, ed. J. J. Grefenstette, 101-111. Hillsdale, NJ: Lawrence
- Erlbaum Associates, 1985.
-
- Baker, J. E. "Reducing Bias and Inefficiency in the Selection Algorithm." In
- Genetic Algorithms and Their Applications: Proceedings of the Second
- International Conference on Genetic Algorithms, ed. J. J. Grefenstette, 14-21.
- Hillsdale, NJ: Lawrence Erlbaum Associates, 1987.
-
- Booker, L. "Improving Search in Genetic Algorithms." In Genetic Algorithms
- and Simulated Annealing, ed. Lawrence Davis, 61-73. Los Altos, CA: Morgan
- Kaufmann Publishers, Inc., 1987.
-
- DeJong, K. A. "An Analysis of the Behavior of a Class of Genetic Adaptive
- Systems." Ph.D. dissertation, University of Michigan, 1975.
-
- Frederick, W., R. Sedlmeyer and C. White. "The Hamming Metric in Genetic
- Algorithms and Its Application to Two Network Problems." Computer Science
- Department, Indiana University-Purdue University at Fort Wayne, (unpublished
- manuscript), 1991.
-
- Goldberg, D. E. Genetic Algorithms in Search, Optimization, and Machine
- Learning. Reading, MA: Addison-Wesley Publishing Company, Inc., 1989.
-
- Grefenstette, J. J. GENESIS Software (Version 5.0). Distributed by The
- Software Partnership, Melrose, MA, 1990.
-
- Holland, J. H. Adaptation in Natural and Artificial Systems. Ann Arbor, MI:
- The University of Michigan Press, 1975.
-
- Lienau, S. A. Effective Investigation of Genetic Algorithms. Master's Thesis,
- University of Missouri-Kansas City, 1992.
-
- Syswerda, G. "Uniform Crossover in Genetic Algorithms." In Proceedings of the
- Third International Conference on Genetic Algorithms, ed. J. D. Schaffer, 2-9.
- San Mateo, CA: Morgan Kaufmann Publishers, Inc., 1989.
-
- Whitley, D. "The GENITOR Algorithm and Selection Pressure: Why Rank-Based
- Allocation of Reproductive Trials is Best." In Proceedings of the Third
- International Conference on Genetic Algorithms, ed. J. D. Schaffer, 116-121.
- San Mateo, CA: Morgan Kaufmann Publishers, Inc., 1989.
-
- Whitley, D. "Fundamental Principles of Deception in Genetic Search." In
- Foundations of Genetic Algorithms, ed. G. J. E. Rawlins, 221-241. San Mateo,
- CA: Morgan Kaufmann Publishers, Inc., 1991.
-
-
-
-
-
- {PAGE|33}
-
-
-
- {PAGE|i}
-
-
-
- {PAGE|55}
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- {PAGE|57}
-
-
-
- {PAGE|56}
-
-
-
-
-
-